7 research outputs found

Discovery and genotyping of structural variation from long-read haploid genome sequence data

Author: Boitano Matthew
Chaisson Mark J.P.
Chin Chen-Shin
Eichler Evan E
Gordon David
Graves-Lindsay Tina A
Hoekzema Kendra
Huddleston John
Korlach Jonas
Kronenberg Zev N
Munson Katherine M
Peluso Paul
Steinberg Karyn Meltz
Vives Laura
Warren Wes
Wilson Richard K
Publication venue: Digital Commons@Becker
Publication date: 01/01/2016

An integrated map of structural variation in 2,504 human genomes

Author: Abyzov Alexej
Alkan Can
Antaki Danny
Auton Adam
Bae Taejeong
Casale Francesco Paolo
Cerveira Eliza
Chaisson Mark J.P.
Chen Jieming
Chen Ken
Chines Peter
Chong Zechen
Dayama Gargi
Fritz Markus Hsi-Yang
Gardner Eugene J.
Garrison Erik
Handsaker Robert E.
Hormozdiari Fereydoun
Huddleston John
Jun Goo
Kashin Seva
Konkel Miriam K.
Lam Hugo Y.K.
Malhotra Ankit
Malig Maika
Meiers Sascha
Mu Xinmeng Jasmine
Rausch Tobias
Shi Xinghua
Stütz Adrian M.
Sudmant Peter H.
Walter Klaudia
Ye Kai
Zhang Yan
Publication venue: LSU Digital Commons
Publication date: 30/09/2015

© 2015 Macmillan Publishers Limited. All rights reserved. Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.

Louisiana State University

Multi-platform discovery of haplotype-resolved structural variation in human genomes

Author: Chaisson Mark J.P.
et al.
Nelson Bradley J.
Sanders Ashley D.
Zhao Xuefang
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 01/01/2018

The incomplete identification of structural variants from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long- and short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three human parent-child trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,181 indel variants (<50 bp) and 31,599 structural variants (≥50 bp) per human genome, a sevenfold increase in structural variation compared to previous reports, including from the 1000 Genomes Project. We also discovered 156 inversions per genome, most of which previously escaped detection, as well as large unbalanced chromosomal rearrangements. We provide near-complete, haplotype-resolved structural variation for three genomes that can now be used as a gold standard for the scientific community, and we make specific recommendations for maximizing structural variation sensitivity for future large-scale genome sequencing studies.

Repository for Publications and Research Data

Discovery and genotyping of structural variation from long-read haploid genome sequence data

Author: Chen-Shin Chin
David Gordon
Evan E. Eichler
John Huddleston
Jonas Korlach
Karyn Meltz Steinberg
Katherine M. Munson
Kendra Hoekzema
Laura Vives
Mark J.P. Chaisson
Matthew Boitano
Paul Peluso
Richard K. Wilson
Tina A. Graves-Lindsay
Wes Warren
Zev N. Kronenberg
Publication venue: 'Cold Spring Harbor Laboratory'

Crossref

Towards complete and error-free genome assemblies of all vertebrate species

Author: Balakrishnan Christopher N.
Biegler Matthew T.
Bista Iliana
Burt Dave
Cantin Lindsey J.
Chaisson Mark
Chow William
Damas Joana
Detrich H. William III
Digby Andrew
Eason Daryl
Edwards Taylor
Fedrigo Olivier
Formenti Giulio
Franchini Paolo
Fungtammasan Arkarachai
Gedman Gregory L.
George Julia M.
Gilbert Marcus Thomas Pius
Grutzner Frank
Haase Bettina
Haggerty Leanne
Howard Jason
Iorns David
Jarvis Erich D.
Kautt Andreas F.
Kim Juwan
Ko Byung June
Koren Sergey
Lama Tanya M.
Lee Chul
Malinsky Milan
McCarthy Shane A.
Meyer Axel
Mountcastle Jacquelyn
Naylor Gavin J.P.
Paez Sadye
Pippel Martin
Rhie Arang
Robertson Bruce
Smith Michelle
Svardal Hannes
Thibaud-Nissen Francoise
Turner George
Uliano-Silva Marcela
Vernes Sonja C.
Wagner Maximillian
Warren Wesley C.
Wilkinson Mark
Winkler Sylke
Publication venue: 'Nature Research Society'
Publication date: 01/01/2021

High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are available for only a few non-microbial species [1-4]. To address this issue, the international Genome 10K (G10K) consortium [5,6] has worked over a five-year period to evaluate and develop cost-effective methods for assembling highly accurate and nearly complete reference genomes. Here we present lessons learned from generating assemblies for 16 species that represent six major vertebrate lineages. We confirm that long-read sequencing technologies are essential for maximizing genome quality, and that unresolved complex repeats and haplotype heterozygosity are major sources of assembly error when not handled correctly. Our assemblies correct substantial errors, add missing sequence in some of the best historical reference genomes, and reveal biological discoveries. These include the identification of many false gene duplications, increases in gene sizes, chromosome rearrangements that are specific to lineages, a repeated independent chromosome breakpoint in bat genomes, and a canonical GC-rich pattern in protein-coding genes and their regulatory regions. Adopting these lessons, we have embarked on the Vertebrate Genomes Project (VGP), an international effort to generate high-quality, complete reference genomes for all of the roughly 70,000 extant vertebrate species and to help to enable a new era of discovery across the life sciences.

NORA - Norwegian Open Research Archives

An integrated map of structural variation in 2,504 human genomes

Author: Abyzov Alexej
Alkan Can
Antaki Danny
Auton Adam
Bae Taejeong
Bashir Ali
Batzer Mark A.
Casale Francesco Paolo
Cerveira Eliza
Chaisson Mark J.P.
Chen Jieming
Chen Ken
Chines Peter
Chong Zechen
Clarke Laura
Dal Elif
Dayama Gargi
Devine Scott E.
Ding Li
Eichler Evan E.
Emery Sarah
Fan Xian
Flicek Paul
Fritz Markus Hsi-Yang
Gardner Eugene J.
Garrison Erik
Gerstein Mark B.
Gibbs Richard A.
Gujral Madhusudan
Handsaker Robert E.
Hormozdiari Fereydoun
Huddleston John
Jun Goo
Kahveci Fatma
Kashin Seva
Kidd Jeffrey M.
Kong Yu
Konkel Miriam K.
Korbel Jan O.
Lam Hugo Y. K.
Lameijer Eric-Wubbo
Lee Charles
Malhotra Ankit
Malig Maika
Marth Gabor
Mason Christopher E.
McCarroll Steven A.
McCarthy Shane
Meiers Sascha
Menelaou Androniki
Mills Ryan E.
Mu Xinmeng Jasmine
Muzny Donna M.
Nelson Bradley J.
Noor Amina
Parrish Nicholas F.
Pendleton Matthew
Quitadamo Andrew
Raeder Benjamin
Rausch Tobias
Romanovitch Mallory
Schadt Eric E.
Schlattl Andreas
Sebat Jonathan
Sebra Robert
Shabalin Andrey A.
Shi Xinghua
Stegle Oliver
Stütz Adrian M.
Sudmant Peter H.
Untergasser Andreas
Walker Jerilyn A.
Walter Klaudia
Wang Min
Ye Kai
Yu Fuli
Zhang Chengsheng
Zhang Jing
Zhang Yan
Zheng-Bradley Xiangqun
Zhou Wanding
Zichner Thomas
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/10/2015

Structural variants (SVs) are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight SV classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analyzing this set, we identify numerous gene-intersecting SVs exhibiting population stratification and describe naturally occurring homozygous gene knockouts suggesting the dispensability of a variety of human genes. We demonstrate that SVs are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of SV complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex SVs with multiple breakpoints likely formed through individual mutational events. Our catalog will enhance future studies into SV demography, functional impact and disease association.

The Jackson Laboratory: The Mouseion at the JAXlibrary

Harvard University - DASH

PubMed Central

eScholarship - University of California

A draft human pangenome reference

Author: Abel Haley J.
Abou Tayoun Ahmad N.
Antonacci-Fulton Lucinda L.
Asri Mobin
Baid Gunjan
Baker Carl A.
Belyaeva Anastasiya
Billis Konstantinos
Bourque Guillaume
Buonaiuto Silvia
Carroll Andrew
Chaisson Mark J.P.
Chang Pi Chuan
Chang Xian H.
Cheng Haoyu
Chu Justin
Cody Sarah
Colonna Vincenza
Cook Daniel E.
Cook-Deegan Robert M.
Cornejo Omar E.
Diekhans Mark
Doerr Daniel
Ebert Peter
Ebler Jana
Eichler Evan E.
Eizenga Jordan M.
Fairley Susan
Fedrigo Olivier
Felsenfeld Adam L.
Feng Xiaowen
Fischer Christian
Flicek Paul
Formenti Giulio
Frankish Adam
Fulton Robert S.
Gao Yan
Garg Shilpa
Garrison Erik
Garrison Nanibaa’ A.
Giron Carlos Garcia
Green Richard E.
Groza Cristian
Guarracino Andrea
Haggerty Leanne
Hall Ira M.
Harvey William T.
Haukness Marina
Haussler David
Heumos Simon
Hickey Glenn
Hoekzema Kendra
Hourlier Thibaut
Howe Kerstin
Jain Miten
Jarvis Erich D.
Ji Hanlee P.
Kenny Eimear E.
Koenig Barbara A.
Kolesnikov Alexey
Korbel Jan O.
Kordosky Jennifer
Koren Sergey
Lee Ho Joon
Lewis Alexandra P.
Li Heng
Liao Wen Wei
Lu Shuangjia
Lu Tsung Yu
Lucas Julian K.
Magalhães Hugo
Marco-Sola Santiago
Marijon Pierre
Markello Charles
Marschall Tobias
Martin Fergal J.
McCartney Ann
McDaniel Jennifer
Miga Karen H.
Mitchell Matthew W.
Monlong Jean
Mountcastle Jacquelyn
Munson Katherine M.
Mwaniki Moses Njagi
Nattestad Maria
Novak Adam M.
Nurk Sergey
Olsen Hugh E.
Olson Nathan D.
Paten Benedict
Pesout Trevor
Phillippy Adam M.
Popejoy Alice B.
Porubsky David
Prins Pjotr
Puiu Daniela
Rautiainen Mikko
Regier Allison A.
Rhie Arang
Sacco Samuel
Sanders Ashley D.
Schneider Valerie A.
Schultz Baergen I.
Shafin Kishwar
Sibbesen Jonas A.
Sirén Jouni
Smith Michael W.
Sofia Heidi J.
Thibaud-Nissen Françoise
Tomlinson Chad
Tricomi Francesca Floriana
Villani Flavia
Vollger Mitchell R.
Wagner Justin
Walenz Brian
Wang Ting
Wood Jonathan M.D.
Zimin Aleksey V.
Zook Justin M.
Publication date: 01/01/2023

Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals [1]. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic sequences and 1,115 gene duplications relative to the existing reference GRCh38. Roughly 90 million of the additional base pairs are derived from structural variation. Using our draft pangenome to analyse short-read data reduced small variant discovery errors by 34% and increased the number of structural variants detected per haplotype by 104% compared with GRCh38-based workflows, which enabled the typing of the vast majority of structural variant alleles per sample.

Online Research Database In Technology